Abstract. MRI quantification of cranial nerves such as the anterior visual
pathway (AVP) is challenging due to their thin, small size, structural
variation along their path, and adjacent anatomic structures. Segmentation
of a pathologically abnormal optic nerve (e.g., optic nerve glioma) poses
additional challenges due to changes in its shape at unpredictable
locations. In this work, we propose a partitioned joint statistical shape
model approach with sparse appearance learning for the segmentation of
healthy and pathological AVP. Our main contributions are: (1) optimally
partitioned statistical shape models for the AVP based on regional shape
variations, for greater local flexibility of the statistical shape model;
(2) a refinement model to accommodate pathological regions as well as areas
of subtle variation by training the model on-the-fly using the initial
segmentation obtained in (1); (3) a hierarchical deformable framework to
incorporate scale information in partitioned shape and appearance models.
Our method, entitled PAScAL (PArtitioned Shape and Appearance Learning),
was evaluated on 21 MRI scans (15 healthy + 6 glioma cases) from pediatric
patients (ages 2-17). The experimental results show that the proposed
localized shape and sparse appearance-based learning approach significantly
outperforms existing segmentation approaches in the analysis of
pathological data.
1 Introduction

MRI is a widely used non-invasive technique for studying and characterizing
diseases of the optic pathway such as optic neuritis, multiple sclerosis,
and optic pathway glioma (OPG) [1]. OPGs are low-grade astrocytomas
inherent to the AVP (i.e., optic nerve, chiasm, and tracts). OPGs occur in
20% of children with neurofibromatosis type 1 (NF1), a very common genetic
disorder that carries an increased risk of tumors in the nervous system.
The disease course is variable, as these tumors may demonstrate several
distinct periods of growth, stability, or regression. Currently, no
quantitative imaging criteria exist to define OPGs secondary to NF1.
Non-invasive computer-aided quantification of these changes can not only
eliminate the excessive physician effort needed to segment these regions
but also increase the precision of volume measures. However, automatic
segmentation of cranial nerve pathways, including the AVP, from MRI is
challenging due to their long, thin shape and varying appearance. A few
non-invasive automated methods to segment the AVP from radiological images
have been reported in the literature, with modest success. Bekes et al. [2]
proposed a geometrical model-based approach; however, the reproducibility
of their approach was found to be less than 50%. Noble et al. [3] presented
a hybrid approach using a deformable model with a level set method to
segment the optic nerves and the chiasm; however, the method was tested
only on healthy cases. Recently, Yang et al. [4] developed a partitioned
approach to healthy AVP segmentation by dividing the pathway into several
shape-homogeneous segments and modeling each segment independently. The
local appearance information in their approach was encoded using normalized
derivatives, three-class fuzzy c-means, and spherical flux. The approach
was the first attempt to accommodate local shape and appearance variation
for healthy AVP segmentation; the method, although promising, did not
provide any objective criterion for the optimal number of partitions.
Moreover, the approach did not accommodate local appearance characteristics
along the nerve boundary that are particularly important in pathological
cases.

Depending on severity, pathological AVPs can have drastically different
local shape and appearance characteristics than healthy ones, which causes
shape model-based segmentation methods to fail in cranial nerve pathways.
To illustrate, Fig. 1(a) shows a healthy optic nerve along with a
contralateral optic nerve with OPG. Fig. 1(b)-(c) show renderings of cases
with OPG in the optic nerve region. In this paper, we propose PAScAL, an
optimally partitioned statistical shape model with sparse appearance
learning for the segmentation of AVPs in both healthy and pathological
cases. The challenge of segmenting larger anatomical structures with
pathologies has been addressed extensively in the literature [5]. However,
the development of similar approaches for smaller, vessel-like structures,
such as the AVP, has traditionally been ignored. By illustrating the
robustness of PAScAL in segmenting the AVP with OPG, we demonstrate the
applicability of the proposed method to segmenting other anatomical
structures with similar characteristics.


2 Methods 

We propose a hierarchical joint partitioned shape model and sparse
appearance learning approach to automatically segment the AVP from MRI
scans of the head. During the training stage, automatically selected
landmarks from healthy cases are first clustered into shape-consistent
overlapping partitions, and a simple shape and appearance model is created
for each partition. The individually learned models are used to produce the
initial segmentation of the AVP using the partitioned active shape model
(ASM$_p$) described in Section 2.2. In the testing stage, the learned
ASM$_p$ is iteratively fitted to new data using the


Partitioned Joint Shape Modeling with Sparse Appearance Learning 





Fig. 1: (a) MRI scan with a healthy (left) and a gliomic (right) optic
nerve. The maximum diameter is 9.54 mm for the nerve with OPG and 1.15 mm
for the healthy nerve of the same patient. (b)-(c) Renderings of typical
OPG cases in the optic nerve: (b) shows OPG in the distal region of the
left optic nerve, and (c) shows one in the proximal region. (d)
Shape-consistent partitioning of a healthy AVP produced by PAScAL.

Fig. 2: Flow diagram of the PAScAL approach to optic nerve segmentation.


appearance guided model. A refinement stage follows to accommodate local
appearance features that are particularly important in cases with
pathologies (e.g., OPG): a sparse local appearance dictionary is learned
on-the-fly for each partition, using the initial segmentation of the test
image as training data. Through these steps, PAScAL adapts to each test
image, compensating for the difficulty of off-line training on pathological
cases, which stems from the unpredictable location, shape, and appearance
of OPG. PAScAL is summarized in Fig. 2. Details of the proposed method are
provided in the subsequent sections.


2.1 Shape consistent agglomerative hierarchical landmark 
partitioning 

In the beginning, the annotated landmarks are grouped by using a modification 
of the agglomerative hierarchical clustering method proposed by Cerrolaza et 
al. [8], minimizing the following objective function: 
where $S$ is the set of all landmarks over the AVP and $\Omega \subset S$
denotes the local shape to be sub-partitioned into an optimal set of
clusters. $V_\Omega$ denotes the dominant direction in $\Omega$, and $V_l$
is the deformation vector for landmark $l$ obtained through the well-known
point distribution model of Cootes et al. [7] over $S$. $\alpha \in [0,1]$
is the coefficient that controls the relative weights ($\alpha$ is set to
0.8 in our experiments), and $\lambda_{\max} = \max\{\|V_l\|\}$.

We define the optimal number of partitions based on shape similarities
calculated using a tailored Silhouette coefficient score. Specifically, let
$\Omega_p$ denote the set of landmarks for the shape partition $p$
containing the landmark $l$, and let $\Omega_{p-l}$ denote the set of
landmarks for the same shape $p$ with landmark $l$ excluded. The
contribution of the landmark $l$ in partition $p$ is then defined as
$a_{p,l} = J(\Omega_p) - J(\Omega_{p-l}) \in \{0,1\}$. A large $a_{p,l}$
denotes higher dissimilarity between the landmark $l$ and the shape
$\Omega_p$. The cost of including landmark $l$ in a partition $p$ is
similarly defined as $b_{p,l} = J(\Omega_{p+l}) - J(\Omega_p)$. The optimal
number of partitions $p_{opt}$ is then found by maximizing

$$p_{opt} = \arg\max_{p} \frac{1}{|L|} \sum_{l} \frac{f(b_{p,l}) - f(a_{p,l})}{\max\left(f(a_{p,l}),\, f(b_{p,l})\right)},$$

where $f(\cdot)$ is the logistic sigmoid function and $|L|$ is the total
number of landmarks. To ensure that adjacent partitions are connected, an
overlapping region is introduced by sharing the boundary landmarks of these
partitions. During the shape model fitting, the shape parameters of the
overlapping landmarks are calculated using the parameters of the
overlapping shapes. Fig. 1(d) demonstrates the proposed agglomerative
hierarchical landmark partitioning approach.
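
The selection of the number of partitions can be illustrated with a minimal Python sketch. It assumes that the per-landmark contributions $a_{p,l}$ and inclusion costs $b_{p,l}$ have already been computed from the clustering objective for each candidate partition count; the function names and the dictionary layout are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid f(.)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def tailored_silhouette(a, b):
    """Mean over landmarks of (f(b) - f(a)) / max(f(a), f(b))."""
    fa, fb = sigmoid(a), sigmoid(b)
    return float(np.mean((fb - fa) / np.maximum(fa, fb)))

def select_partition_count(costs_by_count):
    """Pick the candidate count with the highest score.

    costs_by_count maps a candidate partition count to a pair (a, b) of
    per-landmark arrays (hypothetical layout, for illustration only).
    """
    scores = {p: tailored_silhouette(a, b) for p, (a, b) in costs_by_count.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

Because the sigmoid is strictly positive, the denominator never vanishes, which is one practical reason to pass the raw costs through $f(\cdot)$ before forming the ratio.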


2.2 Landmark weighted partitioned active shape model fitting 

Once the shape partitions are generated, ASM$_p$ is performed on the
individual shapes in the partitioned hyperspace. To adapt to local
appearance characteristics, the following set of appearance features is
used to create overlapping partitioned statistical appearance models for
each partition: (i) the intensities of the neighboring voxels of each
landmark, (ii) a three-class fuzzy c-means filter to robustly delineate
tissues in both dark and bright foregrounds (the AVP passes through
neighboring tissues of varying contrast), and (iii) spherical flux to
exploit the vessel-like characteristics. The AVP has varying contrast in
different regions (i.e., the optic nerve has better contrast against fatty
regions than against gray matter); we therefore assign different levels of
confidence to the landmarks. Specifically, for each landmark in the
training set, the covariance $\Sigma_l$ of these features is calculated
across the training examples, under the assumption that the lower the
variance of the appearance profile of a landmark, the higher our confidence
in that landmark. The weight $w_l$ of a landmark $l$ can therefore be
calculated as $w_l = \frac{1}{1 + \mathrm{tr}(\Sigma_l)}$, where
$\mathrm{tr}(\cdot)$ denotes the trace of a matrix. The shape parameters
for a partition $p$ can be computed as
$b_p = (\phi_p^T W_p \phi_p)^{-1} \phi_p^T W_p (x_p - \bar{x}_p)$, where
$\phi_p$ is the eigenvector matrix, $x_p$ is the aligned training shape
vector, $\bar{x}_p$ is the mean shape vector, and $W_p$ is the diagonal
weight matrix of the landmarks belonging to partition $p$.
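
The weighting and the weighted projection can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, and the shape vector is assumed to store three coordinates per landmark, so each landmark weight is repeated three times on the diagonal of $W_p$.

```python
import numpy as np

def landmark_weights(feature_covariances):
    """w_l = 1 / (1 + tr(Sigma_l)), one weight per landmark."""
    return np.array([1.0 / (1.0 + np.trace(c)) for c in feature_covariances])

def weighted_shape_parameters(phi, weights, x, x_mean):
    """Solve (phi^T W phi) b = phi^T W (x - x_mean) for b.

    phi     : (3n, k) eigenvector matrix of the partition
    weights : (3n,) diagonal of W (one entry per coordinate)
    x       : (3n,) aligned shape vector
    x_mean  : (3n,) mean shape vector
    """
    W = np.diag(weights)
    lhs = phi.T @ W @ phi
    rhs = phi.T @ W @ (x - x_mean)
    # Solving the linear system avoids forming the explicit inverse.
    return np.linalg.solve(lhs, rhs)
```

A typical call would be `weighted_shape_parameters(phi, np.repeat(w, 3), x, x_mean)`, where `w` comes from `landmark_weights`. Landmarks with noisy appearance profiles have a large covariance trace and therefore contribute less to the fitted parameters.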






2.3 On-the-fly sparse appearance learning 


Pathologies can result in changes in the shape and appearance of the AVP at
unpredictable locations (Fig. 1). Statistical shape models have been very
successful in segmenting healthy organs; however, they struggle to
accommodate cases where the shape of the target structure cannot be
predicted through training, such as in cases of OPG. Feature-based
approaches have demonstrated superior performance in the segmentation of
pathological tissues; however, off-line feature-based training on
pathological cases mostly fails due to large variations in both shape and
appearance. To address these challenges, we present a novel on-the-fly
learning approach that uses the initial delineation of the test image
obtained in the previous section as training data to learn an appearance
dictionary in real time.

Specifically, let $R_v(p)$ be an $m \times m \times k$ image patch
extracted from within the initial partition $p$, centered at voxel
$v \in \mathbb{R}^3$. An equal number of patches is extracted from each
partition. The 2D co-occurrence matrix on every slice of the patch is then
calculated from $R_{l_{p,i}}(p)$, and the following gray-level features are
extracted: (1) autocorrelation, (2) contrast, (3) cluster shade, (4)
dissimilarity, (5) energy, (6) entropy, (7) variance, (8) homogeneity, (9)
correlation, (10) cluster prominence, and (11) inverse difference. To
reduce the redundancy in the features, we use k-SVD dictionary learning
[8]. A dictionary $D_p$ is learned for every partition $p \in P$.
Specifically, we begin by extracting the centerline of the initial ASM$_p$
segmentation using the shortest path graph. Afterwards, we choose the point
$c_{p,i}$ on the centerline that is closest to the landmark $l_{p,i}$ in
the $\ell_2$-norm sense. Subsequently, co-occurrence features are extracted
from the patch $R_{c_{p,i}}(p)$.

The likelihood of voxels belonging to the optic nerve is determined by
using sparse representation classification (SRC) [9]. In the SRC framework,
the classification problem is formulated as

$$\arg\min_{\beta} \|f - D_p \beta\|_2^2 + \lambda \|\beta\|_1,$$

where $f$ is the discriminative feature representation of the testing
voxel, $\beta$ is the sparse code for the testing voxel, $\lambda$ is the
coefficient of sparsity, and $r_p = f - D_p \beta_p$ is the reconstruction
residue of the sparse reconstruction. The likelihood $h$ of a testing voxel
$v$ is calculated with the indicator function $h(v)$, with $h(v) = 1$ if
$r_v < r_{v+1}$ and $-1$ otherwise, where $r_v$ is the reconstruction
residue at testing voxel $v$ and $r_{v+1}$ is the reconstruction residue at
the next neighboring voxel to $v$ in the normal direction outwards from the
centerline. To move landmark $l_{p,i}$ on the surface of the segmentation,
we search in the normal direction. The position with the profile pattern
most similar to the boundary pattern is chosen as the new position of the
landmark using the following objective function,

argmax 
range, $N_{c_{p,i}}$ is the outward normal direction at point $c_{p,i}$,
and $\delta$ is the position offset to be optimized against the desired
boundary pattern. The length of the boundary pattern, $|h|$, should be
maximized to mitigate the effects of noise and false positives in the
pattern.
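
The SRC residue comparison behind the indicator $h(v)$ can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the dictionary is taken as given (the k-SVD learning step is not shown), the sparse code is obtained with a plain ISTA solver chosen here for self-containment, the objective carries the conventional factor of 1/2 on the data term (which only rescales the sparsity coefficient), and the residue is reduced to a scalar by taking its Euclidean norm. All function names are hypothetical.

```python
import numpy as np

def sparse_code(D, f, lam=0.1, n_iter=500):
    """Minimize 0.5 * ||f - D b||_2^2 + lam * ||b||_1 with ISTA."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2  # inverse Lipschitz constant
    b = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = b - step * (D.T @ (D @ b - f))                         # gradient step
        b = np.sign(z) * np.maximum(np.abs(z) - lam * step, 0.0)   # soft threshold
    return b

def residue(D, f, lam=0.1):
    """Norm of the reconstruction residue f - D b (norm is an assumption)."""
    return float(np.linalg.norm(f - D @ sparse_code(D, f, lam)))

def boundary_indicator(D, f_v, f_next, lam=0.1):
    """h(v) = 1 if r_v < r_{v+1}, else -1.

    f_v    : feature vector at the testing voxel
    f_next : feature vector at the next voxel along the outward normal
    """
    return 1 if residue(D, f_v, lam) < residue(D, f_next, lam) else -1
```

A voxel whose features are well explained by the partition dictionary yields a small residue, so a run of $+1$ values followed by $-1$ values along the normal marks a candidate boundary position.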

2.4 Hierarchical segmentation 

In order to enhance the robustness of the proposed method, we adopted a
hierarchical segmentation approach that incorporates scale-dependent
information. The idea is that the coarser scales provide robustness while
the finer scales concentrate on the accuracy of the boundary. The
segmentation at a coarser scale is subsequently used to initialize the
finer scale. To achieve the hierarchical joint segmentation, the following
steps are adopted: (1) The number of shape partitions is dyadically
increased from the coarsest to the finest scale. The number of partitions
$n_j$ at the coarser scales $j$ is calculated as
$n_j = \lceil 2^{j-J} G_J \rceil$, where $G_J$ is the number of partitions
at the finest scale $J$. (2) The patch size used to calculate the
appearance features (Section 2.3) is dyadically decreased from coarser to
finer scales.
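
A short sketch of the resulting schedule, assuming the dyadic rule $n_j = \lceil G_J / 2^{J-j} \rceil$ for the number of partitions and halving of the patch edge length at each finer scale. The rounding rule for the patch size is an assumption made here for illustration, and both function names are hypothetical.

```python
import math

def partitions_per_scale(n_finest, n_scales):
    """n_j = ceil(n_finest / 2**(J - j)) for j = 1..J, coarsest scale first."""
    return [math.ceil(n_finest / 2 ** (n_scales - j)) for j in range(1, n_scales + 1)]

def patch_sizes_per_scale(coarsest_size, n_scales):
    """Halve the patch edge at each finer scale, rounding up (assumed rule)."""
    return [max(1, math.ceil(coarsest_size / 2 ** j)) for j in range(n_scales)]
```

With 12 partitions at the finest of three scales, `partitions_per_scale(12, 3)` gives 3, 6, and 12 partitions; with an edge length of 11 voxels at the coarsest scale, `patch_sizes_per_scale(11, 3)` gives edges of 11, 6, and 3 voxels under the assumed rounding.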

3 Results 

After Institutional Review Board approval, 15 pediatric MRI scans with
healthy AVPs and 6 with OPG were acquired for this study. The acquired data
were T1-weighted cube with gadolinium contrast enhancement, with spatial
resolution between $0.39 \times 0.39 \times 0.6$ mm$^3$ and
$0.47 \times 0.47 \times 0.6$ mm$^3$. The manual ground truth for optic
pathway segmentation was created by an expert neuro-radiologist and an
expert neuro-ophthalmologist. During the training stage, the dataset was
affinely registered to a randomly chosen reference image using a two-stage
hierarchical approach: first by optimizing the registration parameters for
the entire brain and then by optimizing over the region of interest around
the optic nerve. The surfaces for each training instance were computed
using a tetrahedral mesh generation approach followed by point set
registration to the reference surface. Based on our training set, the
optimal number of partitions was found to be 12. Three hierarchical scales
for the shape and appearance models were used. The refinement model was
learned on-the-fly from the initial segmentation using a patch of size
$11 \times 11 \times 11$ voxels at the coarsest level. The normalized
derivative, the tissue intensity probability, and the tubular structure
probability were used together as a unified feature set of size 33 to train
the refinement model. To learn the sparse dictionary, co-occurrence
features were extracted with an offset of 1 and four directions
($0$, $\pi/4$, $\pi/2$, $3\pi/4$). The co-occurrence features presented in
Section 2.3 are then calculated for each direction. During the testing
stage, the test image was first registered to the randomly selected
reference image, followed by automatic overlapping partitioning. The mean
shape of the training set was used to initialize the shape model. Fig. 3
shows the qualitative results of PAScAL against the ground truth manual
segmentation.
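
The co-occurrence feature extraction with an offset of 1 and four directions can be sketched in plain NumPy. This is a minimal illustration, assuming a 2D slice already quantized to integer gray levels and a symmetric, normalized co-occurrence matrix; only five of the eleven listed features are shown, definitions of features such as energy and homogeneity vary between toolboxes, and all names are hypothetical.

```python
import numpy as np

# (row, col) pixel offsets for 0, 45, 90, and 135 degrees at distance 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(img, offset, levels):
    """Symmetric, normalized co-occurrence matrix of a 2D integer image."""
    dr, dc = offset
    rows, cols = img.shape
    P = np.zeros((levels, levels))
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[img[r, c], img[r2, c2]] += 1
    P = P + P.T  # count each pair in both orders
    return P / P.sum()

def glcm_features(P):
    """A subset of the gray-level features computed from one matrix."""
    i, j = np.indices(P.shape)
    nz = P[P > 0]  # skip empty cells so the logarithm is defined
    return {
        "contrast": float(np.sum(P * (i - j) ** 2)),
        "dissimilarity": float(np.sum(P * np.abs(i - j))),
        "energy": float(np.sum(P ** 2)),
        "entropy": float(-np.sum(nz * np.log(nz))),
        "homogeneity": float(np.sum(P / (1.0 + (i - j) ** 2))),
    }

def directional_features(img, levels):
    """Features for each of the four directions, keyed by angle in degrees."""
    return {a: glcm_features(glcm(img, off, levels)) for a, off in OFFSETS.items()}
```

For a 3D patch, the same computation would be repeated on every slice and the per-direction feature vectors concatenated before dictionary learning.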

Fig. 3: Segmentation results for a representative healthy case (left) and
OPG case (right). The blue label shows the overlap of the manual and
automated segmentations, the red label shows the manual segmentation, and
the green label shows the automated segmentation.

For quantitative evaluation, the Dice similarity coefficient (DSC) and
Hausdorff distance (HD) were calculated between the segmentation obtained
using PAScAL and the expert-generated ground truth. The quantitative
results based on the leave-one-out evaluation are reported in Fig. 4. An
average DSC of 0.32 for ASM, 0.53 for Yang et al.'s approach [4], and 0.68
for PAScAL was obtained, showing a significant improvement by PAScAL over
both methods (Wilcoxon signed-rank test: p < 0.001 versus ASM; p = 0.015
versus Yang et al.'s partitioned ASM).
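
For reference, the two evaluation metrics can be computed from binary masks as in the sketch below. It is a brute-force illustration with hypothetical function names: the Hausdorff distance is taken between all foreground voxels rather than extracted surfaces, both masks are assumed to be non-empty, and `spacing` is the voxel size in array-axis order.

```python
import numpy as np

def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

def hausdorff(a, b, spacing=(1.0, 1.0, 1.0)):
    """Symmetric Hausdorff distance between foreground voxels (brute force)."""
    pa = np.argwhere(a) * np.asarray(spacing)
    pb = np.argwhere(b) * np.asarray(spacing)
    # Pairwise distances; memory grows with the product of the point counts.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=-1)
    return float(max(d.min(axis=1).max(), d.min(axis=0).max()))
```

For thin structures such as the optic nerve, a shift of one voxel already removes a large share of the overlap, which is worth keeping in mind when comparing DSC values with those reported for larger organs.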


3.1 Automatic optic pathway glioma detection 

Fig. 4: Quantitative comparison of PAScAL with the traditional ASM and the
partitioned ASM method presented by Yang et al. [4].

The demonstrated segmentation of the AVP is used to establish a clinical
biomarker of OPG based on the radius profile of the optic nerve.
Specifically, the average radius of the optic nerve only (see Fig. 1(c)) is
calculated along the centerline of the training data set for healthy and
OPG cases. A statistically significant difference between the average radii
of the two classes was found based on the ground truth data (healthy optic
nerve: 0.401 ± 0.050 mm; optic nerve with OPG: 0.800 ± 0.293 mm;
p < 0.001). No significant correlation was found between the average radius
and patient age, head circumference, or brain volume. To date, no
established nomogram exists for the assessment of OPG; however, according
to the World Health Organization, osteopenia is diagnosed if the T-score is
more than 1 standard deviation ($\sigma$) below the mean of the healthy
population, and osteoporosis is defined as more than $2.5\sigma$ below the
mean [10]. Adopting a similar approach, we define the detection of OPG in
the optic nerve as a mean radius more than $2.5\sigma$ above the mean of
the healthy population. Based on the adopted criterion, all 21 cases (15
healthy + 6 OPG cases) were classified with accuracy, demonstrating the
ability of PAScAL to automatically detect pathologies of the optic nerve.
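
The detection rule reduces to a one-sided threshold on the mean radius. The sketch below uses the healthy-population statistics reported in this section as default values; treating the rule as one-sided (larger radii only) is an assumption consistent with the reported radii, and the function names are hypothetical.

```python
def opg_threshold(healthy_mean, healthy_std, k=2.5):
    """Radius above which a nerve is flagged: mean + k * sigma."""
    return healthy_mean + k * healthy_std

def is_opg(mean_radius_mm, healthy_mean=0.401, healthy_std=0.050, k=2.5):
    """Flag a nerve whose mean radius exceeds the healthy mean by k sigma."""
    return mean_radius_mm > opg_threshold(healthy_mean, healthy_std, k)
```

With the reported healthy statistics, the threshold is 0.401 + 2.5 × 0.050 = 0.526 mm, so the reported mean OPG radius of 0.800 mm lies well above it.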

4 Conclusion 

We presented an automated technique, PAScAL, for the segmentation of the
anterior visual pathway from MRI scans of the brain based on partitioned
shape models with sparse appearance learning. Our work addresses the
challenge of segmenting cranial nerve pathways with shape and appearance
variations due to unpredictable pathological changes. Experiments conducted
using 21 T1 MRI scans, containing both healthy and pathological cases,
demonstrated the superior performance of PAScAL over existing approaches.
The application of PAScAL to segmenting the anterior visual pathway shows
its potential for analyzing other long, thin anatomical structures with
pathologies.